Generalization to Unseen Cases

Authors

  • Teemu Roos
  • Peter Grünwald
  • Petri Myllymäki
  • Henry Tirri
Abstract

We analyze classification error on unseen cases, i.e., cases that differ from those in the training set. Unlike standard generalization error, this off-training-set error may differ significantly from the empirical error with high probability, even for large sample sizes. We derive a data-dependent bound on the difference between off-training-set error and standard generalization error. Our result is based on a new bound on the missing mass, which for small samples is stronger than existing bounds based on Good-Turing estimators. As we demonstrate on UCI data sets, our bound gives nontrivial generalization guarantees in many practical cases. In light of these results, we show that certain claims made in the No Free Lunch literature are overly pessimistic.
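
To make the two quantities named in the abstract concrete, the following is a minimal Python sketch, not the paper's method. It computes the classical Good-Turing estimate of the missing mass (the fraction of sample points whose value occurs exactly once), which is the baseline the abstract contrasts with, and an empirical off-training-set error restricted to test inputs absent from the training set. The function names `good_turing_missing_mass` and `off_training_set_error` and the toy data are assumptions made for illustration; the paper's new bound is not reproduced here.

```python
from collections import Counter


def good_turing_missing_mass(sample):
    """Classical Good-Turing estimate of the missing mass.

    The missing mass is the total probability of cases that do not
    appear in the sample. The estimate is N1 / n, where N1 is the
    number of distinct cases seen exactly once and n is the sample size.
    """
    if not sample:
        raise ValueError("sample must be non-empty")
    counts = Counter(sample)
    singletons = sum(1 for c in counts.values() if c == 1)
    return singletons / len(sample)


def off_training_set_error(predict, train_inputs, test_pairs):
    """Empirical error on test cases whose input is not in the training set.

    `predict` is any callable classifier (hypothetical here).
    Returns None when every test input was already seen in training.
    """
    seen = set(train_inputs)
    unseen = [(x, y) for x, y in test_pairs if x not in seen]
    if not unseen:
        return None
    errors = sum(1 for x, y in unseen if predict(x) != y)
    return errors / len(unseen)


if __name__ == "__main__":
    # Toy sample: "b" and "c" each occur once, so N1 = 2 and n = 4.
    print(good_turing_missing_mass(["a", "a", "b", "c"]))
    # Constant classifier; only inputs 3 and 4 are unseen in training.
    print(off_training_set_error(lambda x: 0, [1, 2], [(1, 1), (3, 0), (4, 1)]))
```

When the estimated missing mass is small, unseen cases carry little probability, so off-training-set error and standard generalization error can diverge without the latter changing much; this is the gap the abstract's bound addresses.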

Related resources

Artificial Intelligent Modeling and Optimizing of an Industrial Hydrocracker Plant

The main objective of this study is the modelling and optimization of an industrial Hydrocracker Unit (HU) by means of Adaptive Neuro Fuzzy Inference System (ANFIS) model. In this case, some data were collected from an industrial hydrocracker plant. Inputs of an ANFIS include flow rate of fresh feed and recycle hydrogen, temperature of reactors, mole percentage of H2 and H2S, feed flow rate and...

Long Short-Term Memory for Speaker Generalization in Supervised Speech Separation

Speech separation can be formulated as learning to estimate a time-frequency mask from acoustic features extracted from noisy speech. For supervised speech separation, generalization to unseen noises and unseen speakers is a critical issue. Although deep neural networks (DNNs) have been successful in noise-independent speech separation, DNNs are limited in modeling a large number of speakers. T...

Localized Generalization Error of Gaussian-based Classifiers and Visualization of Decision Boundaries

In a pattern classification problem, one trains a classifier to recognize future unseen samples using a training dataset. In practice, one should not expect the trained classifier to correctly recognize samples dissimilar to the training dataset. Therefore, finding the generalization capability of a classifier for those unseen samples may not help in improving the classifier's accuracy. The loc...

Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning

As a step towards developing zero-shot task generalization capabilities in reinforcement learning (RL), we introduce a new RL problem where the agent should learn to execute sequences of instructions after learning useful skills that solve subtasks. In this problem, we consider two types of generalizations: to previously unseen instructions and to longer sequences of instructions. For generaliz...

Communicating Hierarchical Neural Controllers for Learning Zero-shot Task Generalization

The ability to generalize from past experience to solve previously unseen tasks is a key research challenge in reinforcement learning (RL). In this paper, we consider RL tasks defined as a sequence of high-level instructions described by natural language and study two types of generalization: to unseen and longer sequences of previously seen instructions, and to sequences where the instructions...

Publication date: 2005